Exploiting iteration-level parallelism in dataflow programs
The term "dataflow" generally encompasses three distinct aspects of computation: a data-driven model of computation, a functional/declarative programming language, and a special-purpose multiprocessor architecture. In this paper we decouple the language and architecture issues by demonstrating that declarative programming is a suitable vehicle for programming conventional distributed-memory multiprocessors. This is achieved by applying several transformations to the compiled declarative program to achieve iteration-level (rather than instruction-level) parallelism. The transformations first group individual instructions into sequential lightweight processes, and then insert primitives to: (1) distribute array allocation over multiple processors, and (2) make computation follow the data distribution by inserting an index-filtering mechanism into a given loop and spawning a copy of it on all PEs; the filter causes each instance of that loop to operate on a different subrange of the index variable. The underlying model of computation is a dataflow/von Neumann hybrid in that execution within a process is control-driven, while the creation, blocking, and activation of processes is data-driven. The performance of this Process-Oriented Dataflow System (PODS) is demonstrated using the hydrodynamics simulation benchmark SIMPLE, on which a 19-fold speedup on a 32-processor architecture has been achieved.
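The index-filtering transformation described above can be sketched in a few lines: every PE runs a copy of the same loop over the full index range, and an inserted filter keeps only the indices that PE owns. This is an illustrative sketch, not PODS code; the names (`owner`, `filtered_loop`, the block-distribution policy) are assumptions for the example.

```python
# Hypothetical sketch of the index-filtering transformation: each PE spawns a
# copy of the loop, and a filter restricts it to that PE's index subrange.

NUM_PES = 4

def owner(i, n, num_pes):
    """Assumed block distribution: consecutive index ranges map to consecutive PEs."""
    block = (n + num_pes - 1) // num_pes
    return i // block

def filtered_loop(pe_id, n, body):
    """Each PE iterates the full range; the inserted filter keeps its subrange."""
    for i in range(n):
        if owner(i, n, NUM_PES) == pe_id:   # the index filter
            body(i)

# Every PE runs the same loop; together they cover 0..n-1 exactly once.
n = 10
covered = []
for pe in range(NUM_PES):
    filtered_loop(pe, n, covered.append)
assert sorted(covered) == list(range(n))
```

In a real system each `filtered_loop` instance would run on its own PE; here they run sequentially only to show that the filters partition the index space without overlap.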
Exploiting iteration-level parallelism in declarative programs
In order to achieve viable parallel processing, three basic criteria must be met: (1) the system must provide a programming environment that hides the details of parallel processing from the programmer; (2) the system must execute efficiently on the given hardware; and (3) the system must be economically attractive. The first criterion can be met by providing the programmer with an implicit rather than explicit programming paradigm; in this way all of the synchronization and distribution are handled automatically. To meet the second criterion, the system must perform synchronization and distribution in such a way that the available computing resources are used to their utmost. And to meet the third criterion, the system must not require esoteric or expensive hardware to achieve efficient utilization. This dissertation reports on the Process-Oriented Dataflow System (PODS), which meets all of the above criteria. PODS uses a hybrid von Neumann/dataflow model of computation supported by an automatic partitioning and distribution scheme. The new partitioning and distribution algorithm is presented along with its underlying principles. Four new mechanisms for distribution are presented: (1) a distributed array allocation operator for data distribution; (2) a distributed L operator for code distribution; (3) a range filter for restricting index ranges to different PEs; and (4) a specialized apply operator for functional parallelism. Simulations show that PODS balances communication overhead with distributed processing to achieve efficient parallel execution on distributed-memory multiprocessors. This is partially due to a new software array-caching scheme, called remote caching, which greatly reduces the number of remote memory reads. PODS is designed to use off-the-shelf components, with no specialized hardware, so that a real PODS machine can be built quickly and cost-effectively.
The system is currently being retargeted to the Intel iPSC/2 so that it can run on commercially available equipment.
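The remote-caching idea mentioned above rests on a property worth making explicit: under single assignment an array element never changes after it is produced, so a remotely read element can be cached locally forever without any invalidation protocol. The sketch below illustrates that reasoning; the class and its interface are hypothetical, not part of PODS.

```python
# Illustrative sketch (hypothetical names) of a software remote cache under
# single assignment: since elements are immutable once produced, a cached copy
# can never become stale, so each remote element is fetched at most once.

class RemoteArrayCache:
    def __init__(self, fetch_remote):
        self._fetch = fetch_remote    # callable performing a remote read
        self._cache = {}
        self.remote_reads = 0

    def read(self, index):
        if index not in self._cache:          # miss: one remote read, then cache
            self.remote_reads += 1
            self._cache[index] = self._fetch(index)
        return self._cache[index]             # hit: served from local memory

# Stand-in for a remote PE's memory.
remote = {i: i * i for i in range(8)}
cache = RemoteArrayCache(remote.__getitem__)
values = [cache.read(i % 4) for i in range(100)]  # heavy reuse of 4 elements
assert cache.remote_reads == 4                    # 4 remote reads, not 100
```

The point of the sketch is the absence of any coherence traffic: mutability is what normally forces invalidation, and single assignment removes it.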
Executing matrix multiply on a process oriented data flow machine
The Process-Oriented Dataflow System (PODS) is an execution model that combines the von Neumann and dataflow models of computation to gain the benefits of each. Central to PODS is the concept of array distribution and its effects on the partitioning and mapping of processes. In PODS, arrays are partitioned by simply assigning consecutive elements to each processing element (PE) equally. Since PODS uses single assignment, there is only one producer of each element. This producing PE owns the element and performs the computations needed to assign it; using this approach, the filling loop is distributed across the PEs. This simple partitioning and mapping scheme provides excellent results for executing scientific code on MIMD machines, allowing MIMD machines to exploit vector and data parallelism easily while still providing the flexibility of MIMD over SIMD for multi-user systems. In this paper, the classic matrix multiply algorithm, with 1024 data points, is executed on a PODS simulator and the results are presented and discussed. Matrix multiply is a good example because it has several interesting properties: there are multiple code blocks; a new array must be dynamically allocated and distributed; there is a loop-carried dependency in the innermost loop; the two input arrays have different access patterns; and the sizes of the input arrays are not known at compile time. Matrix multiply also forms the basis for many important scientific algorithms, such as LU decomposition, convolution, and the fast Fourier transform. The results show that PODS is comparable to both Iannucci's Hybrid Architecture and MIT's TTDA in terms of overhead and instruction power. They also show that PODS distributes the workload evenly across the PEs. The key result is that PODS can scale matrix multiply in a near-linear fashion until there is little or no work to be performed by each PE; beyond that point, overhead and message passing become a major component of the execution time. With larger problems (e.g., at least 16k data points) this limit would be reached at around 256 PEs.
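The owner-computes scheme described above can be sketched for matrix multiply: the result array is split into equal blocks of consecutive rows, and the PE that owns a block performs every computation that assigns into it. This is a minimal illustration, not the PODS simulator; function and variable names are assumptions.

```python
# Hedged sketch of owner-computes partitioning for matrix multiply: C's rows
# are block-distributed, and each PE fills only the rows it owns.

def matmul_partitioned(A, B, num_pes):
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    rows_per_pe = (n + num_pes - 1) // num_pes   # consecutive-element blocks
    for pe in range(num_pes):                    # conceptually parallel PEs
        lo = pe * rows_per_pe
        hi = min(lo + rows_per_pe, n)
        for i in range(lo, hi):                  # rows owned by this PE
            for j in range(p):
                C[i][j] = sum(A[i][k] * B[k][j] for k in range(m))
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert matmul_partitioned(A, B, 2) == [[19, 22], [43, 50]]
```

Because each element of C has exactly one producer, no synchronization is needed between PEs for the writes; only the reads of A and B may be remote, which is where the remote caching mentioned in the companion work pays off.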
Automatic data/program partitioning using the single assignment principle
Loosely coupled MIMD architectures do not suffer from memory contention; hence large numbers of processors may be utilized. The main problem, however, is how to partition data and programs in order to exploit the available parallelism. In this paper we show that efficient schemes for automatic data/program partitioning and synchronization may be employed if single assignment is used. Using simulations of program loops common in scientific computations (the Livermore Loops), we demonstrate that only a small fraction of data accesses are remote, and thus the degradation in network performance due to multiprocessing is minimal.
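The single assignment principle the paper relies on can be made concrete with a small sketch: each array element is written exactly once, so the writer of element i is statically known and the element can be placed in that writer's local memory. The class below is purely illustrative; its names are not from the paper.

```python
# Minimal illustration (hypothetical names) of the single assignment principle:
# every element has exactly one producer, so a second write is an error and
# data placement can follow the producer.

class SingleAssignmentArray:
    def __init__(self, n):
        self._data = [None] * n
        self._written = [False] * n

    def __setitem__(self, i, value):
        if self._written[i]:
            raise RuntimeError(f"element {i} assigned twice")
        self._written[i] = True
        self._data[i] = value

    def __getitem__(self, i):
        return self._data[i]

a = SingleAssignmentArray(4)
for i in range(4):
    a[i] = i * 10          # exactly one producer per element
assert a[3] == 30
try:
    a[3] = 99              # a second assignment is rejected
    raise AssertionError("should not reach here")
except RuntimeError:
    pass
```

It is this write-once guarantee that makes the automatic partitioning safe: no PE can ever observe a stale value, so remote reads need no coherence protocol.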
In-situ regeneration of activated carbon with electric potential swing desorption (EPSD) for H2S removal from biogas
In-situ regeneration of a granular activated carbon was conducted for the first time using electric potential swing desorption (EPSD) with potentials up to 30 V. The EPSD system was compared against a standard no-potential system using a fixed-bed reactor with a bed of 10 g of activated carbon treating a gas mixture with 10,000 ppm H2S. Breakthrough times, adsorption/desorption volumes and capacities, the effect of regeneration, and desorption kinetics were investigated. The analysis showed that desorption of H2S using the new EPSD system was three times quicker than with the no-potential system. Hence, physical adsorption using EPSD over activated carbon is efficient, safe, and environmentally friendly, and could be used for the in-situ regeneration of granular activated carbon without a PSA and/or TSA system. Additionally, adsorption and desorption cycles can be obtained with a classical two-column system, which could lead toward a more efficient and economic biogas-to-biomethane process.
Effectiveness of EDACS Versus ADAPT Accelerated Diagnostic Pathways for Chest Pain: A Pragmatic Randomized Controlled Trial Embedded Within Practice
Study objective
A 2-hour accelerated diagnostic pathway based on the Thrombolysis in Myocardial Infarction score, ECG, and troponin measures (ADAPT-ADP) increased early discharge of patients with suspected acute myocardial infarction presenting to the emergency department compared with standard care (from 11% to 19.3%). Observational studies suggest that an accelerated diagnostic pathway using the Emergency Department Assessment of Chest Pain Score (EDACS-ADP) may further increase this proportion. This trial tests for the existence and size of any beneficial effect of using the EDACS-ADP in routine clinical care.
Methods
This was a pragmatic randomized controlled trial of adults with suspected acute myocardial infarction, comparing the ADAPT-ADP and the EDACS-ADP. The primary outcome was the proportion of patients discharged to outpatient care within 6 hours of attendance, without subsequent major adverse cardiac event within 30 days.
Results
Five hundred fifty-eight patients were recruited, 279 in each arm. Sixty-six patients (11.8%) had a major adverse cardiac event within 30 days (ADAPT-ADP 29; EDACS-ADP 37); 11.1% more patients (95% confidence interval 2.8% to 19.4%) were identified as low risk in EDACS-ADP (41.6%) than in ADAPT-ADP (30.5%). No low-risk patients had a major adverse cardiac event within 30 days (0.0% [0.0% to 1.9%]). There was no difference in the primary outcome of proportion discharged within 6 hours (EDACS-ADP 32.3%; ADAPT-ADP 34.4%; difference −2.1% [−10.3% to 6.0%], P=.65).
Conclusion
There was no difference in the proportion of patients discharged early despite more patients being classified as low risk by the EDACS-ADP than the ADAPT-ADP. Both accelerated diagnostic pathways are effective strategies for chest pain assessment and resulted in an increased rate of early discharges compared with previously reported rates.
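The decision logic of such an accelerated diagnostic pathway has a simple shape: a patient is classified as low risk only when the risk score, the serial troponin measurements, and the ECG are all reassuring. The sketch below illustrates that shape only; the threshold value and parameter names are illustrative assumptions, not taken from the trial protocol reported above.

```python
# Hedged sketch of an accelerated-diagnostic-pathway rule. The EDACS cut-point
# below is an assumed value for illustration, not the trial's specification.

def edacs_adp_low_risk(edacs_score, troponin_0h_ok, troponin_2h_ok, ecg_normal):
    """Low risk only if the score is below threshold AND both serial troponin
    measurements AND the ECG are reassuring; any single flag blocks early
    discharge."""
    EDACS_THRESHOLD = 16   # assumed cut-point for illustration
    return (edacs_score < EDACS_THRESHOLD
            and troponin_0h_ok and troponin_2h_ok and ecg_normal)

assert edacs_adp_low_risk(10, True, True, True)
assert not edacs_adp_low_risk(20, True, True, True)   # score too high
assert not edacs_adp_low_risk(10, True, False, True)  # troponin not reassuring
```

The conjunctive structure is what makes the pathway safe by design: widening the low-risk definition (as EDACS-ADP does relative to ADAPT-ADP) can only reclassify patients for whom every component is negative.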
Demonstration of the temporal matter-wave Talbot effect for trapped matter waves
We demonstrate the temporal Talbot effect for trapped matter waves using ultracold atoms in an optical lattice. We investigate the phase evolution of an array of essentially non-interacting matter waves and observe matter-wave collapse and revival in the form of a Talbot interference pattern. By using long expansion times, we image momentum space with sub-recoil resolution, allowing us to observe fractional Talbot fringes up to 10th order. (Comment: 17 pages, 7 figures.)
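The collapse-and-revival dynamics above follows from a standard piece of reasoning: the momentum components of a lattice-periodic state acquire phases proportional to n², so the state dephases and then recurs exactly at the Talbot time. The numerical sketch below (not the experiment's analysis code; equal weights and the component count are simplifying assumptions) shows this quadratic-phase mechanism.

```python
# Sketch of quadratic-phase Talbot revivals: equal-weight momentum components n
# evolve with phases exp(-i * 2*pi * n^2 * t/T_Talbot); the overlap with the
# initial state collapses and then fully revives at t = T_Talbot.
import math

def overlap_with_initial(t_over_talbot, n_max=20):
    """|<psi(0)|psi(t)>| for 2*n_max+1 equal-weight momentum components."""
    re = sum(math.cos(2 * math.pi * n * n * t_over_talbot)
             for n in range(-n_max, n_max + 1))
    im = sum(math.sin(2 * math.pi * n * n * t_over_talbot)
             for n in range(-n_max, n_max + 1))
    return math.hypot(re, im) / (2 * n_max + 1)

assert abs(overlap_with_initial(0.0) - 1.0) < 1e-9   # initial state
assert abs(overlap_with_initial(1.0) - 1.0) < 1e-9   # full revival at T_Talbot
assert overlap_with_initial(0.37) < 0.9              # dephased in between
```

At rational fractions t = (p/q) T_Talbot the same phases regroup into the fractional revivals the paper resolves up to 10th order.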
Azimuthal anisotropy at RHIC: the first and fourth harmonics
We report the first observations of the first harmonic (directed flow, v_1) and the fourth harmonic (v_4) in the azimuthal distribution of particles with respect to the reaction plane in Au+Au collisions at the Relativistic Heavy Ion Collider (RHIC). Both measurements were done taking advantage of the large elliptic flow (v_2) generated at RHIC. From the correlation of v_2 with v_1 it is determined that v_2 is positive, or in-plane. The integrated v_4 is about a factor of 10 smaller than v_2. For the sixth (v_6) and eighth (v_8) harmonics, upper limits on the magnitudes are reported. (Comment: 6 pages with 3 figures, as accepted for Physical Review Letters. The data tables are at http://www.star.bnl.gov/central/publications/pubDetail.php?id=3)
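The harmonics v_1, v_2, v_4 above are the coefficients of the standard Fourier decomposition of the azimuthal distribution, dN/dphi proportional to 1 + sum over n of 2 v_n cos(n(phi - Psi_RP)), so each v_n can be estimated as the event average of cos(n(phi - Psi_RP)). The sketch below illustrates that estimator on synthetic angles; it is not the STAR analysis code, and the sample size and v_2 value are arbitrary choices.

```python
# Illustrative sketch of extracting flow harmonics from azimuthal angles via
# v_n = <cos n(phi - Psi_RP)>, using synthetic data with a known v_2.
import math
import random

def measure_vn(phis, psi_rp, n):
    """Estimate the n-th flow harmonic from a list of azimuthal angles."""
    return sum(math.cos(n * (phi - psi_rp)) for phi in phis) / len(phis)

# Sample angles from dN/dphi ~ 1 + 2*v2*cos(2(phi - psi)) by accept-reject.
random.seed(0)
v2_true, psi = 0.05, 0.3
bound = 1 + 2 * v2_true                      # maximum of the target density
phis = []
while len(phis) < 200_000:
    phi = random.uniform(0, 2 * math.pi)
    if random.uniform(0, bound) < 1 + 2 * v2_true * math.cos(2 * (phi - psi)):
        phis.append(phi)

assert abs(measure_vn(phis, psi, 2) - v2_true) < 0.01   # recovers v_2
assert abs(measure_vn(phis, psi, 4)) < 0.01             # no v_4 was injected
```

The factor-of-2 in the density and the 1/2 from averaging cos² cancel, which is why the plain average of cos(n(phi - Psi_RP)) returns v_n directly.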
Pion, kaon, proton and anti-proton transverse momentum distributions from p+p and d+Au collisions at sqrt(s_NN) = 200 GeV
Identified mid-rapidity particle spectra of pions, kaons, protons, and anti-protons from 200 GeV p+p and d+Au collisions are reported. A time-of-flight detector based on multi-gap resistive plate chamber technology is used for particle identification. The particle-species dependence of the Cronin effect is observed to be significantly smaller than that at lower energies. The ratio of the nuclear modification factor (R_dAu) between protons and charged hadrons in the measured transverse momentum range is determined in minimum-bias collisions and shows little centrality dependence. The corresponding yield ratio in minimum-bias d+Au collisions is found to be a factor of 2 lower than that in Au+Au collisions, indicating that the Cronin effect alone is not enough to account for the relative baryon enhancement observed in heavy-ion collisions at RHIC. (Comment: 6 pages, 4 figures, 1 table. We extended the pion spectra from transverse momentum 1.8 GeV/c to 3.0 GeV/c.)
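The nuclear modification factor R_dAu used above has a standard definition: the d+Au yield divided by the p+p yield scaled by the mean number of binary nucleon-nucleon collisions. The sketch below states that definition in code with illustrative numbers only; none of the values are the paper's measurements.

```python
# Sketch of the standard nuclear modification factor,
#   R_dAu(pT) = (dN_dAu/dpT) / (<N_coll> * dN_pp/dpT),
# with purely illustrative inputs.

def r_dau(yield_dau, yield_pp, n_coll):
    """Ratio of the d+Au yield to the binary-collision-scaled p+p yield."""
    return yield_dau / (n_coll * yield_pp)

# With no nuclear effects the d+Au yield scales with <N_coll>, giving R_dAu = 1;
# a Cronin-type enhancement pushes the ratio above 1.
assert r_dau(yield_dau=7.2, yield_pp=1.0, n_coll=7.2) == 1.0
assert r_dau(yield_dau=9.0, yield_pp=1.0, n_coll=7.2) > 1.0
```

Comparing R_dAu between protons and charged hadrons, as the paper does, cancels much of the common normalization and isolates the species dependence of the Cronin enhancement.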
IgE allergy diagnostics and other relevant tests in allergy, a World Allergy Organization position paper
Currently, testing for immunoglobulin E (IgE) sensitization is the cornerstone of diagnostic evaluation in suspected allergic conditions. This review provides a thorough and updated critical appraisal of the most frequently used diagnostic tests, both in vivo and in vitro. It discusses skin tests, challenge tests, and serological and cellular in vitro tests, and provides an overview of the indications, advantages, and disadvantages of each in conditions such as respiratory, food, venom, drug, and occupational allergy. Skin prick testing remains the first-line approach in most instances; the added value of serum specific IgE to whole allergen extracts or components, as well as the role of basophil activation tests, is evaluated. Unproven, non-validated diagnostic tests are also discussed. Throughout the review, the reader must bear in mind the relevance of differentiating between sensitization and allergy; the latter entails not only allergic sensitization but also clinically relevant symptoms triggered by the culprit allergen.